By: Zihua Lai & Daniela Quintero Narváez.

Introduction

This project analyzes state-level COVID-19 data for the United States, focusing on trends, comparisons, and correlations within the data. We use the ggplot2 package in R to explore and visualize the data, with the goal of drawing meaningful insights about the pandemic’s impact nationally and at the state level.

The data was obtained from the official website of The COVID Tracking Project: https://covidtracking.com/data/download
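As a minimal sketch of how the downloaded file could be read into R: the file name all-states-history.csv and the object name covid are assumptions made for illustration, not names stated in this report. If the file is not present, the snippet falls back to a three-row excerpt (values copied from the first rows printed in the Data Exploration section) so that it still runs on its own.

```r
# Assumption: the CSV from the download page was saved locally under this
# (hypothetical) name; adjust the path to wherever the file actually lives.
csv_path <- "all-states-history.csv"

if (file.exists(csv_path)) {
  # Keep text columns as character, matching the summary shown in this report.
  covid <- read.csv(csv_path, stringsAsFactors = FALSE)
} else {
  # Fallback excerpt so the sketch runs without the file.
  covid <- read.csv(text = "date,state,death,positive
2021-03-07,AK,305,56886
2021-03-07,AL,10148,499819
2021-03-07,AR,5319,324818", stringsAsFactors = FALSE)
}

dim(covid)    # number of rows and columns that were read
names(covid)  # column names, to compare against the variable list
```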

Objectives:

  1. To explore and visualize state-wise trends across the country regarding COVID-19 cases, deaths, hospitalizations, and testing.

  2. To compare the pandemic’s impact across different states of the country.

  3. To explore the correlations between various indicators such as positive cases, testing rates, and hospitalizations.

  4. To identify patterns and outliers in the data.
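For objective 3, one possible starting point is base R's cor(). The sketch below only illustrates the call: it uses six rows copied from the 2021-03-07 output printed in the Data Exploration section, and the object name snapshot is an assumption. A coefficient from a handful of states on a single day is not a finding in itself.

```r
# Values copied from the 2021-03-07 rows printed in the Data Exploration section.
snapshot <- data.frame(
  state = c("AK", "AL", "AR", "AS", "AZ", "CA"),
  positive = c(56886, 499819, 324818, 0, 826454, 3501394),
  hospitalizedCurrently = c(33, 494, 335, NA, 963, 4291)
)

# use = "complete.obs" skips rows with an NA in either column (here: AS)
# for this calculation only; the data frame itself is left unchanged.
r <- cor(snapshot$positive, snapshot$hospitalizedCurrently,
         use = "complete.obs")
r
```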

Data-set description

Variable Description
date Date on which data was collected by The COVID Tracking Project.
state Two-letter abbreviation for the state or territory.
death Total fatalities with a confirmed or probable COVID-19 case diagnosis.
deathConfirmed Total fatalities with confirmed COVID-19 case diagnosis.
deathIncrease Daily increase in death, calculated from the previous day’s value.
deathProbable Total fatalities with probable COVID-19 case diagnosis
hospitalized Total number of unique individuals who have ever been hospitalized with COVID-19.
hospitalizedCumulative Total number of individuals who have ever been hospitalized with COVID-19.
hospitalizedCurrently Individuals who are currently hospitalized with COVID-19.
hospitalizedIncrease Daily increase in hospitalizedCumulative, calculated from the previous day’s value.
inIcuCumulative Total number of individuals who have ever been hospitalized in the Intensive Care Unit with COVID-19
inIcuCurrently Individuals who are currently hospitalized in the Intensive Care Unit with COVID-19
negative Total number of unique people with a completed PCR test that returns negative.
negativeIncrease Daily increase in negative, calculated from the previous day’s value.
negativeTestsAntibody The total number of completed antibody tests that return negative as reported by the state or territory.
negativeTestsPeopleAntibody The total number of unique people with completed antibody tests that return negative as reported by the state or territory.
negativeTestsViral Total number of completed PCR tests (or specimens tested) that return negative as reported by the state or territory
onVentilatorCumulative Total number of individuals who have ever been hospitalized under advanced ventilation with COVID-19.
onVentilatorCurrently Individuals who are currently hospitalized under advanced ventilation with COVID-19.
positive Total number of confirmed plus probable cases of COVID-19 reported by the state or territory
positiveCasesViral Total number of unique people with a positive PCR or other approved nucleic acid amplification test (NAAT), as reported by the state or territory.
positiveIncrease Daily increase in positive, which counts confirmed plus probable cases, calculated from the previous day’s value.
positiveScore
positiveTestsAntibody Total number of completed antibody tests that return positive as reported by the state or territory.
positiveTestsAntigen Total number of completed antigen tests that return positive as reported by the state or territory
positiveTestsPeopleAntibody The total number of unique people with completed antibody tests that return positive as reported by the state or territory.
positiveTestsPeopleAntigen Total number of unique people with a completed antigen test that returned positive as reported by the state or territory
positiveTestsViral Total number of completed PCR tests (or specimens tested) that return positive as reported by the state or territory
recovered Total number of people that are identified as recovered from COVID-19. Types of “recovered” cases include those who are discharged from hospitals, released from isolation, or those who have not been identified as fatalities after a number of days (30 or more) post disease onset.
totalTestEncountersViral Total number of people tested per day via PCR testing as reported by the state or territory.
totalTestEncountersViralIncrease Daily increase in totalTestEncountersViral, calculated from the previous day’s value.
totalTestResults In most states, the totalTestResults field is currently computed by adding positive and negative values because, historically, some states do not report totals, and to work around different reporting cadences for cases and tests.
totalTestResultsIncrease Daily increase in totalTestResults, calculated from the previous day’s value. This calculation inherits all the caveats associated with totalTestResults, and its use at the state/territory level is not recommended, so we will not use it in this project.
totalTestsAntibody Total number of completed antibody tests as reported by the state or territory
totalTestsAntigen Total number of completed antigen tests, as reported by the state or territory.
totalTestsPeopleAntibody The total number of unique people who have been tested at least once via antibody testing as reported by the state or territory.
totalTestsPeopleAntigen Total number of unique people who have been tested at least once via antigen testing, as reported by the state or territory
totalTestsPeopleViral Total number of unique people tested at least once via PCR testing, as reported by the state or territory.
totalTestsPeopleViralIncrease Daily increase in totalTestsPeopleViral, calculated from the previous day’s value.
totalTestsViral Total number of PCR tests (or specimens tested) as reported by the state or territory.
totalTestsViralIncrease Daily increase in totalTestsViral, calculated from the previous day’s value.
Variable definitions are sourced from The COVID Tracking Project website: https://covidtracking.com/about-data/data-definitions
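Several fields in the list above are described as a daily increase calculated from the previous day's value. A minimal sketch of that arithmetic in base R follows. The 2021-03-07 values for AL and AR are taken from the rows printed in the Data Exploration section; the 2021-03-06 values do not appear in this report and are back-computed here from the printed deathIncrease (for example 10148 - (-1) = 10149), so they are an assumption. This illustrates the definition only and is not necessarily how The COVID Tracking Project computed the field.

```r
# Two days of cumulative deaths for two states.
# 2021-03-06 values are back-computed (assumption), 2021-03-07 values are
# copied from the rows printed in this report.
deaths <- data.frame(
  date  = as.Date(c("2021-03-06", "2021-03-07", "2021-03-06", "2021-03-07")),
  state = c("AL", "AL", "AR", "AR"),
  death = c(10149, 10148, 5297, 5319)
)

# Sort so that each state's rows are in date order before differencing.
deaths <- deaths[order(deaths$state, deaths$date), ]

# ave() applies the function within each state; the first day of a state
# has no previous value, so its increase is NA.
deaths$deathIncrease <- ave(deaths$death, deaths$state,
                            FUN = function(x) c(NA, diff(x)))
deaths
```

The AL row comes out as -1: a cumulative count that is revised downward produces a negative increase, which is consistent with the negative minimums visible in the summary of the Increase columns.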

Exploratory Analysis

     date              state               death         deathConfirmed 
 Length:20780       Length:20780       Min.   :    0.0   Min.   :    0  
 Class :character   Class :character   1st Qu.:  161.2   1st Qu.:  607  
 Mode  :character   Mode  :character   Median : 1108.0   Median : 2410  
                                       Mean   : 3682.2   Mean   : 3770  
                                       3rd Qu.: 4387.5   3rd Qu.: 5462  
                                       Max.   :54124.0   Max.   :21177  
                                       NA's   :850       NA's   :11358  
 deathIncrease     deathProbable     hospitalized     hospitalizedCumulative
 Min.   :-201.00   Min.   :   0.0   Min.   :    1.0   Min.   :    1.0       
 1st Qu.:   0.00   1st Qu.:  79.0   1st Qu.:  985.2   1st Qu.:  985.2       
 Median :   6.00   Median : 216.0   Median : 4472.0   Median : 4472.0       
 Mean   :  24.79   Mean   : 417.3   Mean   : 9262.8   Mean   : 9262.8       
 3rd Qu.:  24.00   3rd Qu.: 460.0   3rd Qu.:12248.5   3rd Qu.:12248.5       
 Max.   :2559.00   Max.   :2594.0   Max.   :82237.0   Max.   :82237.0       
                   NA's   :13187    NA's   :8398      NA's   :8398          
 hospitalizedCurrently hospitalizedIncrease inIcuCumulative inIcuCurrently  
 Min.   :    0.0       Min.   :-12257.00    Min.   :   6    Min.   :   0.0  
 1st Qu.:  166.5       1st Qu.:     0.00    1st Qu.: 501    1st Qu.:  60.0  
 Median :  531.0       Median :     0.00    Median :1295    Median : 172.0  
 Mean   : 1190.6       Mean   :    37.36    Mean   :1934    Mean   : 359.6  
 3rd Qu.: 1279.0       3rd Qu.:    36.00    3rd Qu.:2451    3rd Qu.: 380.0  
 Max.   :22851.0       Max.   : 16373.00    Max.   :9263    Max.   :5225.0  
 NA's   :3441                               NA's   :16991   NA's   :9144    
    negative        negativeIncrease    negativeTestsAntibody
 Min.   :       0   Min.   :-968686.0   Min.   :   587       
 1st Qu.:   53941   1st Qu.:      0.0   1st Qu.: 11242       
 Median :  305972   Median :    141.5   Median : 78888       
 Mean   :  848225   Mean   :   3589.1   Mean   :145581       
 3rd Qu.: 1056611   3rd Qu.:   3916.0   3rd Qu.:162926       
 Max.   :10186941   Max.   : 212974.0   Max.   :864153       
 NA's   :7490                           NA's   :19322        
 negativeTestsPeopleAntibody negativeTestsViral onVentilatorCumulative
 Min.   :     1              Min.   :       1   Min.   :  32.0        
 1st Qu.: 54874              1st Qu.:  303300   1st Qu.: 220.2        
 Median :100282              Median :  936600   Median : 412.0        
 Mean   :188711              Mean   : 1818574   Mean   : 574.7        
 3rd Qu.:261121              3rd Qu.: 2316865   3rd Qu.: 818.0        
 Max.   :816231              Max.   :16887410   Max.   :1533.0        
 NA's   :19808               NA's   :15756      NA's   :19490         
 onVentilatorCurrently    positive       positiveCasesViral positiveIncrease
 Min.   :   0.0        Min.   :      0   Min.   :      0    Min.   :-7757   
 1st Qu.:  29.0        1st Qu.:   5754   1st Qu.:  10376    1st Qu.:   65   
 Median :  86.0        Median :  46064   Median :  68442    Median :  435   
 Mean   : 151.6        Mean   : 165156   Mean   : 178662    Mean   : 1384   
 3rd Qu.: 185.0        3rd Qu.: 177958   3rd Qu.: 202425    3rd Qu.: 1335   
 Max.   :2425.0        Max.   :3501394   Max.   :3501394    Max.   :71734   
 NA's   :11654         NA's   :188       NA's   :6534                       
 positiveScore positiveTestsAntibody positiveTestsAntigen
 Min.   :0     Min.   :     0        Min.   :     0      
 1st Qu.:0     1st Qu.:   852        1st Qu.:  1085      
 Median :0     Median :  8624        Median : 13661      
 Mean   :0     Mean   : 19811        Mean   : 31837      
 3rd Qu.:0     3rd Qu.: 25900        3rd Qu.: 49010      
 Max.   :0     Max.   :190026        Max.   :211546      
               NA's   :17434         NA's   :18547       
 positiveTestsPeopleAntibody positiveTestsPeopleAntigen positiveTestsViral
 Min.   :     0              Min.   :    3              Min.   :      0   
 1st Qu.:  3156              1st Qu.: 2682              1st Qu.:  16159   
 Median : 11956              Median :17763              Median :  65359   
 Mean   : 20517              Mean   :25259              Mean   : 198500   
 3rd Qu.: 19059              3rd Qu.:47012              3rd Qu.: 224680   
 Max.   :178979              Max.   :81803              Max.   :2628176   
 NA's   :19686               NA's   :20147              NA's   :11822     
   recovered       totalTestEncountersViral totalTestEncountersViralIncrease
 Min.   :      2   Min.   :       0         Min.   :-16946                  
 1st Qu.:   3379   1st Qu.:  193794         1st Qu.:     0                  
 Median :  17618   Median :  905322         Median :     0                  
 Mean   :  94242   Mean   : 2702109         Mean   :  5578                  
 3rd Qu.:  93152   3rd Qu.: 2780542         3rd Qu.:     0                  
 Max.   :2502609   Max.   :39695100         Max.   :324671                  
 NA's   :8777      NA's   :15549                                            
 totalTestResults   totalTestResultsIncrease totalTestsAntibody
 Min.   :       0   Min.   :-130545          Min.   :      0   
 1st Qu.:  104050   1st Qu.:   1206          1st Qu.:  18965   
 Median :  655267   Median :   6125          Median :  84652   
 Mean   : 2186936   Mean   :  17508          Mean   : 163403   
 3rd Qu.: 2264766   3rd Qu.:  19086          3rd Qu.: 230011   
 Max.   :49646014   Max.   : 473076          Max.   :1054711   
 NA's   :166                                 NA's   :15991     
 totalTestsAntigen totalTestsPeopleAntibody totalTestsPeopleAntigen
 Min.   :      1   Min.   :     1           Min.   :     3         
 1st Qu.:  20047   1st Qu.: 54913           1st Qu.: 37676         
 Median : 123384   Median :103968           Median :144130         
 Mean   : 308920   Mean   :165432           Mean   :168188         
 3rd Qu.: 432727   3rd Qu.:183103           3rd Qu.:255251         
 Max.   :2664340   Max.   :995580           Max.   :580372         
 NA's   :17359     NA's   :18580            NA's   :19781          
 totalTestsPeopleViral totalTestsPeopleViralIncrease totalTestsViral   
 Min.   :       0      Min.   :-1043744              Min.   :       0  
 1st Qu.:  141470      1st Qu.:       0              1st Qu.:  132460  
 Median :  419372      Median :       0              Median :  731651  
 Mean   :  965011      Mean   :    2740              Mean   : 2304555  
 3rd Qu.: 1229298      3rd Qu.:    2478              3rd Qu.: 2496925  
 Max.   :11248247      Max.   :  820817              Max.   :49646014  
 NA's   :11583                                       NA's   :6264      
 totalTestsViralIncrease
 Min.   :-1154583       
 1st Qu.:       0       
 Median :    1896       
 Mean   :   12961       
 3rd Qu.:   12441       
 Max.   : 2164543       
                        

The summary shows that many variables have a large share of missing values; for example, positiveTestsPeopleAntigen is missing in 20,147 of the 20,780 rows. Because of the complexity of the attributes and the way the data was compiled, we will neither remove nor impute these values, as doing either would make the analysis and visualizations inaccurate.
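To make the extent of missingness concrete without removing or imputing anything, missing values can be counted per column. The sketch below uses a six-row excerpt copied from the 2021-03-07 rows printed in the Data Exploration section; the object name covid_excerpt is an assumption. The same calls would apply unchanged to the full data frame.

```r
# Excerpt copied from the rows printed in the Data Exploration section.
covid_excerpt <- data.frame(
  state = c("AK", "AL", "AR", "AS", "AZ", "CA"),
  death = c(305, 10148, 5319, 0, 16328, 54124),
  deathConfirmed = c(NA, 7963, 4308, NA, 14403, NA),
  deathProbable = c(NA, 2185, 1011, NA, 1925, NA),
  hospitalizedCurrently = c(33, 494, 335, NA, 963, 4291)
)

# Number and share of missing values in each column.
na_counts <- colSums(is.na(covid_excerpt))
na_share  <- colMeans(is.na(covid_excerpt))
na_counts
na_share

# Individual summaries can skip NAs per calculation while the data itself
# stays untouched (nothing is dropped or imputed).
mean(covid_excerpt$deathConfirmed, na.rm = TRUE)
```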

Additionally, the variables need to be properly typed and categorized before moving on to visualization and analysis; for example, the summary shows that date is stored as character rather than as a date.
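A minimal sketch of that preparation step, assuming it means giving date a Date type and treating state as a categorical factor. The object name covid_types is hypothetical, and the two rows are copied from the output printed in the Data Exploration section.

```r
# Two rows as they would arrive from read.csv(): both columns are character.
covid_types <- data.frame(
  date  = c("2021-03-07", "2021-03-07"),
  state = c("AK", "AL"),
  death = c(305, 10148),
  stringsAsFactors = FALSE
)

# Parse the ISO-formatted text into Date objects so that time axes and
# date arithmetic behave correctly.
covid_types$date <- as.Date(covid_types$date, format = "%Y-%m-%d")

# Treat the two-letter abbreviation as a categorical variable.
covid_types$state <- factor(covid_types$state)

str(covid_types)
```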

Data Exploration

        date state death deathConfirmed deathIncrease deathProbable
1 2021-03-07    AK   305             NA             0            NA
2 2021-03-07    AL 10148           7963            -1          2185
3 2021-03-07    AR  5319           4308            22          1011
4 2021-03-07    AS     0             NA             0            NA
5 2021-03-07    AZ 16328          14403             5          1925
6 2021-03-07    CA 54124             NA           258            NA
  hospitalized hospitalizedCumulative hospitalizedCurrently
1         1293                   1293                    33
2        45976                  45976                   494
3        14926                  14926                   335
4           NA                     NA                    NA
5        57907                  57907                   963
6           NA                     NA                  4291
  hospitalizedIncrease inIcuCumulative inIcuCurrently negative negativeIncrease
1                    0              NA             NA       NA                0
2                    0            2676             NA  1931711             2087
3                   11              NA            141  2480716             3267
4                    0              NA             NA     2140                0
5                   44              NA            273  3073010            13678
6                    0              NA           1159       NA                0
  negativeTestsAntibody negativeTestsPeopleAntibody negativeTestsViral
1                    NA                          NA            1660758
2                    NA                          NA                 NA
3                    NA                          NA            2480716
4                    NA                          NA                 NA
5                    NA                          NA                 NA
6                    NA                          NA                 NA
  onVentilatorCumulative onVentilatorCurrently positive positiveCasesViral
1                     NA                     2    56886                 NA
2                   1515                    NA   499819             392077
3                   1533                    65   324818             255726
4                     NA                    NA        0                  0
5                     NA                   143   826454             769935
6                     NA                    NA  3501394            3501394
  positiveIncrease positiveScore positiveTestsAntibody positiveTestsAntigen
1                0             0                    NA                   NA
2              408             0                    NA                   NA
3              165             0                    NA                   NA
4                0             0                    NA                   NA
5             1335             0                    NA                   NA
6             3816             0                    NA                   NA
  positiveTestsPeopleAntibody positiveTestsPeopleAntigen positiveTestsViral
1                          NA                         NA              68693
2                          NA                         NA                 NA
3                          NA                      81803                 NA
4                          NA                         NA                 NA
5                          NA                         NA                 NA
6                          NA                         NA                 NA
  recovered totalTestEncountersViral totalTestEncountersViralIncrease
1        NA                       NA                                0
2    295690                       NA                                0
3    315517                       NA                                0
4        NA                       NA                                0
5        NA                       NA                                0
6        NA                       NA                                0
  totalTestResults totalTestResultsIncrease totalTestsAntibody
1          1731628                        0                 NA
2          2323788                     2347                 NA
3          2736442                     3380                 NA
4             2140                        0                 NA
5          7908105                    45110             580569
6         49646014                   133186                 NA
  totalTestsAntigen totalTestsPeopleAntibody totalTestsPeopleAntigen
1                NA                       NA                      NA
2                NA                   119757                      NA
3                NA                       NA                  481311
4                NA                       NA                      NA
5                NA                   444089                      NA
6                NA                       NA                      NA
  totalTestsPeopleViral totalTestsPeopleViralIncrease totalTestsViral
1                    NA                             0         1731628
2               2323788                          2347              NA
3                    NA                             0         2736442
4                    NA                             0            2140
5               3842945                         14856         7908105
6                    NA                             0        49646014
  totalTestsViralIncrease
1                       0
2                       0
3                    3380
4                       0
5                   45110
6                  133186
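The six rows printed above are enough to sketch the kind of state comparison this project is after. The example below is a hedged illustration with ggplot2, not the plot used in the final analysis: the object names latest and p are assumptions, and the values are the positive counts copied from the rows above. The plotting part runs only if ggplot2 is installed.

```r
# Cumulative positive cases on 2021-03-07, copied from the rows above.
latest <- data.frame(
  state = c("AK", "AL", "AR", "AS", "AZ", "CA"),
  positive = c(56886, 499819, 324818, 0, 826454, 3501394)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Horizontal bar chart, states ordered by their case count.
  p <- ggplot(latest, aes(x = reorder(state, positive), y = positive)) +
    geom_col() +
    coord_flip() +
    labs(x = "State or territory", y = "Cumulative positive cases",
         title = "Positive cases as of 2021-03-07 (six-row excerpt)")
  # Call print(p) to draw the chart.
}
```

Because these are raw counts, large states dominate the chart; a per-capita comparison would need population figures, which are not part of this data set.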

Conclusions

Dashboard